ABSTRACT

This report examines the provision of a high performance memory system
for an electrically addressed spatial light modulator (SLM), destined for use
in an optical correlator. Two distinct fast memory system designs are
proposed. One design is advanced to the stage of specification of major elements
of its architecture and timing signals. The other design is developed to the
conceptual stage. Appendices review recent high performance semiconductor
memory technology.

Memory design for electrically addressed spatial light modulators


EXECUTIVE  SUMMARY 

Real  time  automatic  target  recognition  (ATR)  is  an  operation  that  is  of  importance 
in  the  detection  and  recognition  of  military  platforms  of  various  types.  In  particular,  this 
technology  could  facilitate  the  recognition  of  land  vehicle  targets  in  the  northern  Australian 
environment,  a  capability  that  has  been  identified  by  the  ADF  as  an  important  element  of 
focal  area  reconnaissance  (reconnaissance  within  a  100  km  radius  around  sites  of  strategic 
importance  in  a  military  conflict).  The  research  reported  here  was  conducted  under  the 
sponsorship  of  Directorate  Combat  Force  Development  (Land),  and  examines  one  specific 
technology  of  use  in  some  types  of  ATR  systems  that  operate  on  optical  imagery. 

A  spatial  light  modulator  (SLM)  is  a  specialised  type  of  two  dimensional  liquid  crystal 
display  (as  used  in  portable  televisions,  digital  watches  and  laptop  computers).  SLMs 
are the critical component of optical correlators, which are instruments for detecting the
occurrence of specific small targets within a larger image, this detection being the critical
operation of ATR. There exists a spatial filter pattern tuned to any conceivable target, for which
the  optical  correlator  experiences  an  acute  response  when  that  spatial  filter  pattern  is 
displayed  on  its  SLM.  An  optical  correlator  maintains  a  large  ensemble  of  spatial  filters 
in  memory,  which  enables  it  to  detect  an  equally  large  variety  of  target  views  in  the  input 
image.  For  every  input  image,  the  optical  correlator  must  cycle  through  all  of  its  stored 
target  filters,  retrieving  each  one  from  memory  and  writing  it  to  the  filter  SLM. 

For  real  time  operation  there  is  a  time  limit  within  which  the  filter  cycling  must 
occur.  The  shorter  the  cycle  time  per  filter,  the  more  filters  that  can  be  tried,  and  the 
more  robust  the  target  recognition  capability.  This  report  examines  the  design  of  high 
speed  memory  systems  for  SLMs,  such  an  endeavour  being  necessary  for  the  attainment 
of  better  performing  optical  correlator  based  ATR  systems.  Two  alternative  functional 
designs  for  high  speed  SLM  memory  systems  are  proposed  and  conceptually  developed  as 
far  as  appropriate  for  the  aims  of  this  task.  Appendices  review  characteristics  of  the  latest 
high  performance  memory  technology,  from  the  perspective  of  SLM  memory  requirements. 

The  outcomes  of  this  investigation  are  twofold.  Most  concrete  is  the  presentation  of 
two  design  concepts  for  high  performance  SLM  memory  systems.  Such  advanced  memory 
architectures  will  be  very  beneficial  to  the  performance  of  optical  correlators  that  may  be 
of  interest  to  defence  forces,  yet  at  present  these  sophisticated  memories  are  not  available 
in  any  commercial  optical  correlators.  Another  outcome  is  the  accumulation  by  DSTO 
researchers  of  a  degree  of  technical  knowledge  about  SLM  memory  systems,  that  will  be 
of  assistance  if  DSTO  is  called  upon  to  objectively  assess  proposals  for  the  use  of  optical 
correlators by the ADF.

1 Introduction


SLMs¹ are indispensable components of optical correlators. The input image within
which  a  pattern  is  sought  is  displayed  on  one  SLM,  and  spatial  filters  tuned  to  individual 
patterns  are  sequentially  scanned  on  another  SLM.  The  ensemble  of  spatial  filters  would 
typically  be  quite  large,  so  for  real  time  pattern  recognition  it  is  imperative  that  filters  can 
be  written  onto  the  SLM  very  quickly.  The  achievable  rate  of  SLM  updating  is  determined 
by  either  the  filter  memory  system  data  throughput,  or  the  physical  limits  of  the  SLM 
response  to  input  data  transitions.  This  report  examines  the  former  factor. 

Only  electrically  addressed  SLMs  will  be  considered  in  this  report,  so  the  SLM  memory 
system  is  a  digital  electronic  one.  The  majority  of  the  report  is  devoted  to  the  presentation 
and  discussion  of  two  very  different  propositions  for  future  high  speed  SLM  memories. 
A  functional  design  will  be  provided  for  a  synchronous  DRAM  memory  system,  together 
with  an  explanation  of  its  operation.  The  other  memory  system,  based  on  an  SRAM  cache 
memory  on  the  SLM  chip,  will  be  considered  only  at  the  conceptual  level.  In  formulating 
the  more  general  aspects  of  the  discussion  of  semiconductor  memory,  I  referenced  the 
authoritative  book  by  Prince  [1],  the  series  of  special  reports  edited  by  Comerford  and 
Watson  [2],  and  the  tutorial  by  Prince  [3]. 

Although  the  memory  designs  developed  in  this  report  are  in  principle  compatible  with 
any  active  backplane  SLM,  for  definiteness  I  will  be  specifically  assuming  FLC  SLMs.  An 
exposition  of  a  variety  of  SLM  technologies,  including  FLC  SLMs,  is  contained  in  the 
compilation edited by Efron [4].

Of SRAM and DRAM, precedent dictates that SLM main memory in optical correlators
should be constructed of DRAM components, just as it is in most other large digital
systems. This is because SLM main memory must be very large to store the requisite
number of filters. The lower power consumption, higher density, lower price and better
availability of DRAMs compared with SRAMs outweigh the advantages that SRAMs have
over DRAMs.

Before  embarking  upon  a  consideration  of  the  DRAM  technology  that  I  believe  has 
the  best  prospects  for  high  performance  SLM  memory,  the  reader  may  wish  to  review  the 
appendices  to  obtain  a  broader  perspective  of  DRAM  technology  in  the  context  of  SLM 
memory  usage.  Appendix  A  alerts  readers  to  the  emerging  ‘killer  application’  for  DRAMs 
in  consumer  electronics,  where,  propitiously,  the  requirements  on  memory  performance  are 
quite  similar  to  those  for  SLM  memories.  Appendix  B  briefly  reviews  many  types  of  high 
performance  DRAMs,  including  appraisals  of  their  suitability  for  use  in  SLM  memories. 
Some  of  the  memory  terminology  used  in  the  appendices  is  defined  in  the  footnotes  of 
the  main  body  of  the  report.  Reading  of  the  appendices  may  be  forgone  without  loss  of 
continuity. 


¹ See the Glossary on page 19 for the expansion of all acronyms used in this report, together with brief
explanations.

2 Synchronous DRAM


Conventional  DRAMs  operate  asynchronously,  that  is,  the  integrated  circuit  is  not 
paced  by  an  externally  supplied  system  clock  (DRAMs  may  have  their  own  chip  level 
internal  clock — it  is  the  absence  of  synchronisation  with  the  global  system  clock  that 
renders  them  asynchronous).  Superficially  this  seems  to  be  the  modus  operandi  best 
suited  to  achieving  the  physical  limits  of  circuit  speed,  since  the  component  can  respond 
immediately  to  input  logic  transitions,  without  waiting  for  the  next  system  clock  transition 
to  occur.  However,  this  reasoning  ignores  the  finite  time  taken  for  the  circuit  to  complete 
its  response,  and  be  available  to  be  prompted  for  another  response.  The  response  time 
is  a  latent  period  in  the  operation  of  the  circuit,  because  any  input  during  this  period  is 
ignored  and  lost,  or  at  best,  ignored  and  stored  for  later  response.  Most  importantly,  this 
response  time  usually  is  much  longer  than  the  time  period  needed  for  a  new  signal  to  be 
issued  by  the  circuit  module  that  is  interacting  with  the  memory.  The  two  interacting 
modules  are  not  synchronised  with  one  another,  so  that  time  will  always  elapse  between 
one  memory  response  being  completed  and  the  next  response  being  prompted.  Therefore, 
in  practice  asynchronous  operation  of  digital  systems  does  not  approach  the  physical  limits 
of  circuit  speed. 

Accordingly, it is possible that electronic systems containing modules with very different
response times may operate more quickly if they are globally synchronous, that is,
they  are  paced  by  a  universal  system  clock.  The  argument  advanced  is  that  synchronous 
systems  operate  with  complete  certainty  about  the  clock  cycle  in  which  response  signals 
become  available,  so  that  no  time  need  be  wasted  in  polling  and  waiting  for  incoming 
signals.  The  time  that  an  asynchronous  system  would  spend  on  polling  and  waiting,  a 
synchronous  system  would  utilise  to  execute  useful  operations.  Also,  a  synchronous  system 
always  will  read  an  output  within  one  clock  cycle  of  it  becoming  available,  whereas  there 
is  no  upper  time  limit  within  which  an  asynchronous  system  must  read  outputs.  This 
reasoning  motivated  the  development  of  synchronous  SRAMs  quite  some  time  ago,  and  it 
has  become  conventional  wisdom  that  synchronous  SRAMs  are  superior  to  asynchronous 
SRAMs  in  fast  applications. 

More recently, synchronous DRAMs have emerged from fabricators such as Texas Instruments,
Micron Technology, Samsung, Hitachi, Toshiba, Mitsubishi, NEC, Fujitsu and
Oki. Synchronous DRAM is a nonproprietary concept specified by the Electronic Industries
Association/Joint Electron Devices Engineering Council (EIA/JEDEC) JC 42.3
DRAM Standards Committee. The semiconductor industry in general seems to regard
synchronous DRAM as the most promising contender for the future standard for high
performance DRAMs. This fact reduces the technology risks in basing the SLM memory
design on a synchronous DRAM architecture. Current generation SLM backplanes
already operate synchronously, so there should not be any logic design problems in interfacing
SLMs to synchronous DRAM systems. If anything, the ‘glue’ logic actually should
be simpler, because of the absence of the need to accommodate asynchronous ‘artifacts,’
such as wait states.

The synchronous DRAM design is optimised for sequential access² rather than random
access³, which is entirely consistent with usage of the SLM memory in optical correlators,
in which spatial filters stored in memory are sequentially transferred to the SLM without
any manipulation. Some of the design techniques used to maximise the speed of operation
of synchronous DRAMs are well established. One is address pipelining, in which processing
of the new address is begun before completion of the old address processing, thus reducing
the effective cycle time⁴ by overlapping cycles. Another is memory bank⁵ interleaving
on the DRAM chip, in which one bank may be prepared for access while another bank is
undergoing access. By overlapping cycles in this way, the effective cycle time is reduced for
sequential access, because consecutive rows will always reside in separate memory banks
on the memory chip (this characteristic is what distinguishes memory interleaving from
simple partitioning of memory into blocks). Memory interleaving is most effective for the
sequential access used in SLM memories.
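
The distinction between interleaving and simple partitioning can be illustrated with a minimal Python sketch. The function names, the two-bank default and the modulo mapping are illustrative assumptions; they are not taken from any particular synchronous DRAM data sheet.

```python
def interleaved_bank(row, n_banks=2):
    """Bank holding a row when rows are dealt out to banks in rotation (assumed mapping)."""
    return row % n_banks


def partitioned_bank(row, rows_per_bank):
    """Bank holding a row when memory is simply cut into contiguous blocks."""
    return row // rows_per_bank


# Sequential access to rows 0..7 alternates between the two banks when interleaved,
# so one bank can be prepared while the other is being read.
print([interleaved_bank(r) for r in range(8)])
# With simple partitioning the same rows all fall in one bank, so no overlap is possible.
print([partitioned_bank(r, rows_per_bank=1024) for r in range(8)])
```

Under this assumed mapping, consecutive rows never share a bank, which is the property that allows the access cycles to overlap.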

Synchronous DRAMs output a serial bit stream at the clock frequency when operating
in burst mode⁶. The sequence of events culminating in this output begins with the specification
of the address of a row in the memory array. Once a single access time⁷ period
has elapsed, all of the data in that row has been written into the output shift register,
where it can be clocked out as a serial bit stream. If the memory input/output bus is
several bits wide, then contiguous rows of the memory are simultaneously accessed in this
fashion. Each serial line has its own shift register, which is synchronised with all of the
other shift registers. The number of bits in a chip page⁸ is typically within the range 512
to 2048 bits.

Increased synchronous DRAM functionality is provided by the ‘nibbled page’ architecture
of Toshiba Corporation (Numata et al [6]), in which nonoverlapping 8 bit sequences
of  data  from  a  specific  row  are  able  to  be  accessed  at  random,  at  the  same  data  rate  as 
strictly  sequential  accesses  of  chip  pages.  This  facility  is  useful  for  high  speed  memory 
systems  that  require  random  access  capabilities,  but  for  SLM  memories  which  strictly 
adhere  to  sequential  access  operation,  it  is  superfluous. 

The synchronous DRAM standard includes a clock disable facility, which puts the
memory chip into a low power standby mode. In this mode, the system clock is intermittently
allowed to propagate cycles to refresh the DRAM cells, but only often enough
to maintain data integrity. The expectation is that eventually synchronous DRAMs will
have a ‘self refresh’ capability (already available on premium asynchronous DRAMs), in
which memory chip circuitry controls the refreshing of memory cells, without assistance
from external circuitry.

² Sequential Access: Successive bytes accessed from/directed to consecutive storage cells on the chip.
Analogous to page or burst mode.

³ Random Access: Successive bytes accessed from/directed to arbitrary storage cells on the chip.
Analogous to byte mode.

⁴ Cycle Time: Period of the memory access cycle, that is, the minimum possible time between initiation
of one memory access and initiation of the next.

⁵ Memory Bank: An independently functional subarray of memory cells.

⁶ Burst Mode: Synonym for page mode. See Footnote 9.

⁷ Access Time: Elapsed time between initiation of memory access and completion of the byte transfer
phase of the memory access cycle. This excludes the post-transfer phase of the memory access cycle, in
which the memory resets itself to become available for the next access. The cycle time is inclusive of both
phases of the memory cycle, hence it is longer than the access time.

⁸ Chip Page: A row of memory cells in the memory array.

DSTO-RR-0094

Present  generation  synchronous  DRAMs  operate  at  clock  frequencies  of  100  MHz, 
with  unsubstantiated  claims  of  the  potential  of  attaining  up  to  500  MHz  performance  in 
the  future.  Input/output  bus  widths  are  8  bits,  giving  a  total  data  rate  of  800  Mb/s. 
The  intrinsic  access  time  for  a  chip  page  is  about  60  ns,  although  this  can  be  hidden  by 
interleaving  memory  chips,  as  demonstrated  in  Section  3.  Memory  capacity  is  presently 
16  Mb  per  chip. 

Synchronous DRAM communicates at high serial bit rates (100 Mb/s). At these frequencies
signal propagation exhibits guided wave effects such as reflection, mutual coupling,
pulse dispersion, and transmission line behaviour in general. Specialised packaging,
interconnection strategies and interface circuitry are needed to accommodate such signal
behaviour. Printed circuit board tracks have to be treated as transmission lines and antennas.
Parasitic capacitance and inductance of interface electronics, device packaging
and waveguide discontinuities materially affect signal quality, and so must be minimised
and any residual taken into account in the design. Low voltage swing interfaces must be
used to increase signalling rate and reduce power consumption. Signal transmitters must
be able to impress signals on low characteristic impedance (~100s of Ω) interface buses.
As the clock period approaches the wave propagation time over the system, special care
must be taken to keep the signal and clock phases aligned at all locations in the system.

To ensure effective operation at their high clock frequencies, synchronous DRAM interfaces
are low voltage swing types, such as GTL or CTT. The components with which
the synchronous DRAMs communicate, such as the SLM, must have a compatible interface.
Since electrically addressed SLMs have never been designed with low voltage swing
interfaces, this represents an area of technological risk in the adoption of synchronous
DRAM memories for SLMs. Packaging that is suitable for high frequency operation, namely
miniature housings that reduce the effective length of wiring and leads, is necessary
for synchronous DRAMs. Present synchronous DRAMs are available as TSOPs.


3  A  synchronous  DRAM  SLM  memory 

In  this  section  I  shall  propose  a  synchronous  DRAM  based  SLM  memory  architecture 
that is suitable for use in an optical correlator. The proposed memory system is realisable
using current generation synchronous DRAMs, as discussed in Section 2. However,
present  electrically  addressed  SLMs  use  TTL  or  CMOS  interface  levels.  To  interface  with 
a  synchronous  DRAM  constituted  memory,  an  SLM  that  operates  with  a  GTL  or  CTT 
interface  would  need  to  be  fabricated. 

The  specifications  of  the  memory  are,  to  a  large  extent,  dictated  by  the  specifications 
of  the  SLM  and  optical  correlator  operating  algorithm.  It  will  be  assumed  that  the  SLM 
has  512  x  512  pixels.  Two  cases  of  degree  of  modulation  will  be  concurrently  examined. 
One  is  binary  modulation,  and  the  other  is  hexadecimal  quantised  modulation.  Binary 
modulation is determined by a single bit; hexadecimal modulation by four bits. The
operating algorithm will be assumed to have a suite of 1024 spatial filters available. The
capacity  required  of  the  SLM  memory  is  thus  256  Mb  for  binary  modulation,  and  1  Gb 
for  hexadecimal  modulation. 
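
The capacity figures follow directly from the assumed SLM size and filter count. The short Python sketch below reproduces the arithmetic; the function name is hypothetical, and it assumes that 1 Mb means 2^20 bits and 1 Gb means 2^30 bits, which is the convention under which the quoted figures come out exactly.

```python
MBIT = 2 ** 20  # assumed binary definition of one megabit
GBIT = 2 ** 30  # assumed binary definition of one gigabit


def slm_memory_bits(rows=512, cols=512, bits_per_pixel=1, n_filters=1024):
    """Total storage, in bits, for a suite of spatial filters."""
    return rows * cols * bits_per_pixel * n_filters


print(slm_memory_bits(bits_per_pixel=1) // MBIT, "Mb")  # binary modulation: 256 Mb
print(slm_memory_bits(bits_per_pixel=4) // GBIT, "Gb")  # hexadecimal modulation: 1 Gb
```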

Figure  1  (page  8)  displays  the  functional  block  diagram  of  a  256  Mb  SLM  memory;  the 
expansion  to  1  Gb  is  achieved  by  replacing  each  memory  chip  by  four  of  the  same  chip. 
The  synchronous  DRAM  components  are  presently  commercially  available.  Capacity  of 
the  memory  chips  is  16  Mb,  configured  as  2  MB  of  8-bit  bytes.  There  is  one  data  line  for 
each  bit  in  the  bytes,  thus  bytes  are  communicated  in  parallel  across  8  bit  data  buses.  A 
pair  of  memory  chips  shares  each  data  bus  in  an  interleaved  arrangement,  whereby  chips 
are  accessed  alternately,  with  sufficient  overlap  to  account  for  the  access  time  latency. 
Each  chip  access  reads  a  complete  chip  page.  Serial  data  rate  is  100  Mb/s,  and  this  is 
sustainable  for  the  duration  of  the  filter  load  into  the  SLM,  because  the  chip  interleaving 
allows  the  page  access  time  for  one  chip  to  occur  simultaneously  with  the  finish  of  the  other 
chip’s  page  output.  Effectively,  there  is  no  page  access  latency.  This  seamless  operation 
is  demonstrated  in  the  timing  diagrams  of  Figure  3,  which  are  discussed  below.  There 
are  eight  chip  pairs,  served  by  eight  8-bit  data  buses,  giving  a  total  bus  width  of  64  bits. 
Actually,  this  architecture  is  consistent  with  the  one  advocated  by  Salters  for  HDTV,  as 
described  in  Appendix  A. 
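
As a cross-check on the architecture just described, the following Python sketch derives the chip count, capacity and total bus width from the per-bus figures. The function and key names are hypothetical. The aggregate data rate it reports (64 lines at 100 Mb/s each) is not quoted in the text; it is simply derived from the stated bus width and serial rate.

```python
def memory_architecture(n_buses=8, chips_per_bus=2, chip_capacity_mb=16,
                        bus_width_bits=8, clock_mhz=100):
    """Summarise an interleaved memory built from identical chips on parallel buses."""
    n_chips = n_buses * chips_per_bus
    total_bus_bits = n_buses * bus_width_bits
    return {
        "chips": n_chips,
        "capacity_mb": n_chips * chip_capacity_mb,
        "total_bus_bits": total_bus_bits,
        # One bit per data line per clock cycle, so Mb/s = lines x MHz.
        "aggregate_mbps": total_bus_bits * clock_mhz,
    }


print(memory_architecture())                 # 256 Mb system: two chips per bus
print(memory_architecture(chips_per_bus=8))  # 1 Gb system: each chip replaced by four
```

The second call reflects the stated expansion route, in which the number of buses stays at eight and only the number of chips per bus grows.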

Since the memory system is synchronous, the 64 data outputs will switch simultaneously.
On occasions when the net balance of high logic states at the output of one memory
chip changes as a result of the logic transitions, transient currents will flow in the power
supply and ground lines. The transient currents may induce voltage spikes in these lines,
and voltage ringing in the output lines. As a precaution, printed circuit board design techniques
to mitigate this ‘ground bounce,’ without appreciably slowing signal propagation,
may need to be used. This endeavour is assisted by the low inductance of TSOP pins; one
good reason why synchronous DRAMs are packaged in TSOPs.

A  mechanism  by  which  incoming  data  is  distributed  to  the  pixel  array  of  the  SLM 
is  indicated  by  Figure  2  (page  9),  for  the  case  of  binary  modulation.  Each  of  the  64  bit 
lines  is  connected  to  an  8  bit  shift  register.  The  serial  data  fills  the  shift  register  in  eight 
clock cycles, at which instant a 1-in-8 clock parallel latches the data into eight consecutive
column  buffers  connected  to  the  SLM  pixel  array.  Column  buffer  output  data  remains 
fixed  for  the  next  eight  clock  cycles,  which  must  be  long  enough  for  the  data  signals  to 
be  impressed  upon  their  corresponding  SLM  pixels.  Data  latching  into  the  column  buffers 
occurs  concurrently  with  the  last  serial  entry  into  the  shift  register,  so  there  should  be 
no  interruption  to  the  serial  data  flow,  which  is  entirely  consistent  with  the  operation  of 
the  synchronous  DRAM  in  the  fast  burst  mode.  For  the  currently  dominant  synchronous 
DRAM  clock  period  of  10  ns,  the  time  allocated  to  charging  the  pixel  capacitors  (note 
that  the  FLC  is  just  the  dielectric  medium  between  the  capacitor  electrodes)  in  each  row 
of  SLM  pixels  is  80  ns,  which  translates  into  write  time  allocation  for  the  whole  SLM  array 
of 40.96 µs. For comparison, current generation SLMs with 512 pixel rows are physically
limited  to  row  charging  times  of  43  ns,  using  polysilicon  row  lines  driven  from  both  ends; 
while  anticipated  future  SLMs  using  metal  row  lines  are  expected  to  achieve  row  charging 
times  of  about  1  ns  (Serati  [7]). 
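
The behaviour of one serial-to-parallel conversion circuit can be sketched in a few lines of Python. This is a behavioural model only, not the circuit of Figure 2: the function name is hypothetical, and the bit ordering (the first bit received ends up in the first of the eight column buffers) is an assumption, since the text does not specify it.

```python
def serial_to_parallel(serial_bits, width=8):
    """Shift bits in one per clock cycle and latch the register every `width` cycles.

    Returns the successive contents of the column buffers, one list per latch
    event, that is, one list per SLM row for the columns served by this line.
    """
    register = [0] * width
    latched = []
    for cycle, bit in enumerate(serial_bits):
        register = register[1:] + [bit]   # serial entry into the shift register
        if cycle % width == width - 1:    # 1-in-8 clock coincides with the last entry
            latched.append(list(register))
    return latched


stream = [1, 0, 1, 1, 0, 0, 1, 0,   # data for eight columns of one row
          0, 1, 0, 1, 0, 1, 0, 1]   # data for the same columns of the next row
print(serial_to_parallel(stream))
```

Because the latch fires in the same cycle as the final serial entry, the next bit can enter on the following cycle, so the serial stream is never paused.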

A similar circuit to that shown in Figure 2 would be suitable in the case of hexadecimal
modulation, in which four bits of data would be supplied to each SLM pixel. Each shift
register would still service eight consecutive columns of pixels, therefore the size of the
shift register would need to be increased to 32 bits. It would take 32 clock cycles to
serially load the shift register, so latch enabling of the 32 column buffers would need to be
controlled by a 1-in-32 clock. The time allocated to charging the pixel capacitors in each
row of SLM pixels is now 320 ns, giving a time allocation of 163.84 µs for the whole SLM
array.
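
The row and array write times quoted for the two modulation cases can be reproduced with the following Python sketch. The function name and parameter names are hypothetical; the values are those assumed in the text (10 ns clock period, eight columns per serial line, 512 rows).

```python
def slm_write_times(bits_per_pixel, clock_period_ns=10, columns_per_line=8, rows=512):
    """Return (row time in ns, whole-array write time in microseconds)."""
    # One row is complete when every shift register has been filled once.
    row_time_ns = clock_period_ns * columns_per_line * bits_per_pixel
    array_time_us = row_time_ns * rows / 1000
    return row_time_ns, array_time_us


for label, bits in (("binary", 1), ("hexadecimal", 4)):
    row_ns, array_us = slm_write_times(bits)
    print(f"{label}: {row_ns} ns per row, {array_us:.2f} us per array")
```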

Not  shown  in  Figure  2  are  the  digital  to  analogue  converters  (D/A)  that  must  be 
interposed  between  the  data  distribution  digital  circuitry  and  each  column  line,  in  the 
case  of  hexadecimal  modulation.  To  the  extent  that  the  D/A  response  time  is  much  less 
than  the  persistence  time  of  data  at  the  D/A  input,  the  inclusion  of  a  D/A  stage  in 
the  signal  path  should  not  have  a  material  impact  on  the  attainable  data  transfer  rate. 
8  bit  D/As  presently  being  incorporated  on  SLM  backplanes  have  response  times  of  about 
25  ns  (Serati  [7]),  which  compares  favourably  with  the  4  bit  input  data  persistence  time  of 
320  ns.  So  the  presence  of  column  D/As  in  the  case  of  hexadecimal  modulation,  although 
adding  complexity,  does  not  add  significant  delay. 

To  elucidate  the  data  transfer  process  during  the  loading  of  the  SLM,  timing  diagrams 
are  displayed  in  Figure  3  (page  10),  for  the  case  of  binary  modulation.  Periodic  inputs  to 
the  synchronous  DRAM  components  include  Clock,  Row  Access  Strobe  (RAS:  a  1-in-N 
clock,  where  N  is  the  number  of  bits  in  a  chip  page)  and  Column  Access  Strobe  (CAS: 
RAS  delayed  by  four  periods  of  Clock).  Periodic  inputs  to  the  synchronous  SLM  include 
Clock, 1-in-8 Clock and SLM RAS (Clock÷8).

The  serial  entrance  of  data  into  the  SLM  shift  register  is  demonstrated  by  the  waveform 
‘Shift Register Cell 7 Data.’ Entrance of the final data bit is accompanied by a 1-in-8 Clock
pulse,  which  latches  the  data  into  the  SLM  column  buffers,  as  demonstrated  in  waveform 
‘Column  7  Pixel  Data.’ 

SLM RAS is a waveform with the same period and phase as 1-in-8 Clock, but with
a  50%  duty  cycle.  This  signal  is  applied  to  the  gates  of  the  pass  FETs  that  connect 
the  column  lines  to  one  of  the  SLM  pixel  capacitor  electrodes,  for  every  pixel  along  the 
selected  row.  Only  while  the  pass  transistor  gate  voltage  is  high  (for  n-channel  FETs)  will 
the  column  line  voltage  be  imposed  on  the  pixel  capacitor  electrode.  Both  the  row  and 
column  lines  are  relatively  long,  traversing  the  complete  width  and  height,  respectively,  of 
the  SLM  pixel  array.  And  both  of  these  lines  are  heavily  capacitively  loaded,  each  driving 
512  identical  devices  (FET  gate  for  the  row  line;  pixel  capacitor  for  the  column  line).  In 
combination,  these  two  factors  significantly  delay  signal  propagation  to  the  ends  of  the 
lines. 

To  ensure  that  a  high  state  on  the  pass  FET  gate  coincides  with  the  new  data  voltage 
at  the  source  of  the  pass  FET  (the  source  being  connected  to  the  column  line),  both  the 
column data and the SLM RAS must persist for a considerable length of time. The 1-in-8
Clock pulse may not persist long enough to fulfil the role of SLM RAS, but the Clock÷8
pulse does, so the latter is used as the SLM RAS. Note that the SLM RAS pulse falling
edge must have propagated to the end of the row of pass transistors before the data on
the column lines is changed, or else the new data may overwrite the old data (even if only
partially). The Clock÷8 waveform ensures that all pass transistors have been turned off
before new data is introduced onto the column lines.
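
The relationship between the two SLM control waveforms can be made concrete with a small Python sketch that generates them cycle by cycle. It is an idealised model with hypothetical names: it assumes that the SLM RAS rising edge coincides with the 1-in-8 Clock pulse, and it ignores the propagation delays along the row and column lines discussed above.

```python
def slm_control_waveforms(n_cycles, divide_by=8):
    """Return (latch, slm_ras) as lists of 0/1 values, one per clock cycle.

    latch   : single-cycle pulse in the cycle when the last bit enters the shift register
    slm_ras : same period and phase as latch, but high for half of the period
    """
    latch, slm_ras = [], []
    for n in range(n_cycles):
        phase = (n + 1) % divide_by          # phase 0 marks the final serial entry
        latch.append(1 if phase == 0 else 0)
        slm_ras.append(1 if phase < divide_by // 2 else 0)
    return latch, slm_ras


latch, slm_ras = slm_control_waveforms(16)
print("1-in-8 Clock:", latch)
print("SLM RAS     :", slm_ras)
```

In this model the SLM RAS has been low for four cycles by the time each new latch pulse arrives, which is the margin that allows the falling edge to reach the end of the row before the column data changes.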

The  timing  diagrams  of  Figure  3  correspond  to  a  time  interval  during  which  a  page 
of  data  from  Bank  #1  (i.e.  one  of  the  memory  chips  connected  to  a  particular  data  bus) 
finishes,  and  a  page  of  data  from  Bank  #2  (i.e.  the  other  memory  chip  connected  to  the 
same  data  bus)  starts.  The  interleaving  of  the  two  memory  banks  allows  the  Bank  #2 
Row  Access  and  Column  Access  Strobes  to  be  issued  while  Bank  #1  is  still  transmitting 
data. This gives Bank #2 enough time to respond to the data request, and start transmitting
data as soon as Bank #1 has stopped, with no interruption in the data flow to the
SLM.  Such  continuous  data  flow  is  not  possible  without  the  external  synchronous  DRAM 
interleaving  adopted  in  the  memory  system  architecture  of  Figure  1. 
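
The hand-over between the two banks can be sketched as a simple timetable in Python. This is not a reproduction of Figure 3: the names are hypothetical, the Column Access Strobe is omitted, and the numbers are assumptions derived from figures quoted earlier (a 60 ns page access time at a 10 ns clock gives 6 cycles; 512 bits is the lower end of the quoted page sizes).

```python
def interleaved_schedule(n_pages, page_bits=512, access_cycles=6):
    """Strobe and burst timetable, in clock cycles, for two chips sharing one data bus."""
    if access_cycles > page_bits:
        raise ValueError("access latency cannot be hidden behind a single page burst")
    schedule = []
    for page in range(n_pages):
        burst_start = access_cycles + page * page_bits
        schedule.append({
            "page": page,
            "bank": 1 + page % 2,                      # Bank #1, #2, #1, ...
            "ras_cycle": burst_start - access_cycles,  # strobe issued ahead of the burst
            "burst_start": burst_start,
            "burst_end": burst_start + page_bits,      # first cycle after the burst
        })
    return schedule


def has_gaps(schedule):
    """True if any idle bus cycle separates consecutive page bursts."""
    return any(b["burst_start"] != a["burst_end"]
               for a, b in zip(schedule, schedule[1:]))


for entry in interleaved_schedule(3):
    print(entry)
```

In this timetable each strobe after the first is issued while the other bank is still bursting, so only the very first access latency is visible on the bus.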

Similar timing diagrams to those of Figure 3 would also apply for the case of hexadecimal
modulation, but with a 1-in-32 Clock replacing the 1-in-8 Clock, and the SLM RAS
being a Clock÷32 waveform.


4 Improving performance by a cache memory on the SLM

If  extremely  high  memory  throughput  does  not  need  to  be  sustained  over  an  extended 
time  interval,  then  it  may  be  feasible  to  use  slower  conventional  DRAMs  for  main  memory, 
and  interpose  a  fast  cache  memory  capable  of  storing  one  filter  between  the  main  memory 
and  the  SLM  backplane.  While  the  liquid  crystal  is  responding  to  a  newly  applied  electric 
field,  followed  by  the  correlator’s  CCD  detector  array  being  read,  and  then  the  correlation 
signal  being  interpreted  by  a  processor,  a  new  filter  may  be  transferred  from  main  memory 
to  the  cache  memory,  ready  for  fast  access  when  the  correlator  is  ready.  There  is  only 
value  in  this  memory  scheme  if  the  spatial  filters  to  be  impressed  upon  the  SLM  are 
accessed in a predetermined order. Unlike the synchronous DRAM system of Section 3,
the performance of SLM memory using a cache deteriorates the more often the next filter
is chosen on the basis of the correlation results of the present filter.

Filter  transfer  from  main  memory  to  cache  memory  can  be  slow;  it  is  the  transfer  from 
cache  memory  to  the  SLM  backplane  that  should  be  as  fast  as  possible.  This  characteristic 
is  best  achieved  by  having  an  SRAM  cache  memory  fabricated  on  the  same  integrated 
circuit  as  the  SLM  backplane,  with  a  very  wide  data  bus  (preferably  equal  to  the  number 
of  columns  in  the  SLM  array)  connecting  the  cache  memory  with  the  SLM  array. 

Figure 1: Functional block diagram of an interleaved synchronous DRAM SLM memory,
suitable for use in an optical correlator. Individual data buses are 8 bits wide. Total
memory capacity is 256 Mb.

Figure 2: One of the 64 identical serial-to-parallel data conversion circuits on the SLM
chip, for the case of binary modulation. Not shown is the low voltage swing signal receiver,
that detects the incoming serial bit stream, and converts it to CMOS voltage levels.

Figure 3: Timing diagrams for the data transfer process from interleaved synchronous
DRAM chips to the SLM, for the case of binary modulation. The time interval that is
chosen includes the uninterrupted transition of data flow from Bank #1 to Bank #2.

A 512x512 pixel SLM fabricated with smallest feature sizes of 2 μm (which is fairly
coarse resolution by the standards of presently achievable photolithography) would
essentially completely fill a standard integrated circuit die of 200 mm² area, taking into
account the peripheral circuitry associated with the SLM array (Handschy et al [8]). However,
by the Year 2000 it is predicted that the smallest attainable feature sizes from lithography
will have shrunk to about 0.2 μm, while standard die sizes will have grown to over 400 mm²
in area (Geppert [9]). A 1 Mb (i.e. 512x512 pixels x 4 bits/pixel) SRAM memory array
fabricated with 0.8 μm minimum feature sizes occupies about 45 mm² area, ignoring
peripheral circuitry (Prince [1, p. 432]). Taken together, these figures indicate that
imminent integrated circuit fabrication technology will enable a one filter sized SRAM
cache to be fabricated on the same chip as the SLM.
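As a rough check on the area figures quoted above, the following sketch scales the SRAM array area with the square of the minimum feature size. The quadratic scaling rule and the function name `scaled_area_mm2` are assumptions made for illustration only; they do not come from the cited sources, and a real layout will not shrink this ideally.

```python
def scaled_area_mm2(area_mm2, feature_from_um, feature_to_um):
    """Scale a circuit area, assuming area varies as feature size squared."""
    return area_mm2 * (feature_to_um / feature_from_um) ** 2

SRAM_AREA_0P8_UM = 45.0   # mm^2, 1 Mb SRAM array at 0.8 um (Prince [1])
SLM_AREA_2_UM = 200.0     # mm^2, 512x512 SLM at 2 um (Handschy et al [8])
DIE_AREA_2000 = 400.0     # mm^2, predicted die size (Geppert [9])

# Assumed ideal shrink of the 1 Mb SRAM array from 0.8 um to 0.2 um features
sram_area = scaled_area_mm2(SRAM_AREA_0P8_UM, 0.8, 0.2)

# Pessimistic case: the SLM array is not shrunk at all
spare_area = DIE_AREA_2000 - SLM_AREA_2_UM

print(f"1 Mb SRAM at 0.2 um: about {sram_area:.1f} mm^2")
print(f"Die area left beside an unshrunk SLM: {spare_area:.0f} mm^2")
```

Under these assumptions, even the unshrunk 45 mm² SRAM array would fit in the area left over on a 400 mm² die.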

CMOS  based  SRAMs  have  random  access  times  as  short  as  8  ns  at  present.  Combined 
with  a  very  wide  data  path  from  the  SRAM  cache  to  the  SLM  pixel  array,  the  speed 
potential  of  SRAM  ensures  that  filter  transfer  from  cache  to  SLM  can  be  made  much 
quicker  than  the  already  impressively  quick  synchronous  DRAM  design  of  Section  3.  The 
value  of  the  filter  transfer  time  is  strongly  dependent  on  the  design  characteristics  of 
the  SRAM  cache  memory.  Specific  designs  will  not  be  considered  here,  although  general 
guiding  principles  will  be  addressed. 

Data is written to SRAM cache from DRAM main memory operating in page mode (see
footnote 9), with a cycle time as low as 40 ns for asynchronous DRAM. This cycle time is not
sustainable  across  pages,  since  at  the  start  of  each  new  page,  the  applicable  DRAM  access 
time  is  that  for  byte  mode,  which  is  longer  than  that  for  page  mode.  An  indication  of 
the  quickening  of  memory  access  in  page  mode  as  opposed  to  byte  mode,  is  conveyed  by  a 
comparison  of  performance  in  the  two  modes  for  a  1  Mb  DRAM  from  Motorola  (Prince  [1, 
p.  62]) — access  time:  25  ns  for  page  access  and  85  ns  for  byte  access;  cycle  time:  50  ns  for 
page  access  and  165  ns  for  byte  access. 

For a typical DRAM page length of 512 bits, and a currently maximum DRAM data bus
width of 16 bits, a 512x512 pixel binary filter will require at least 656 μs to be transferred
from DRAM main memory to cache; the corresponding figure for a hexadecimal filter is
2625 μs. However, these cache memory write times scale inversely with data bus width.
DRAMs with 32 bit data buses will be commercially available by the Year 2000, with 64 bit
widths available by 2003. Furthermore, several DRAM chips can be mounted and connected
in parallel to behave as a single composite DRAM component having a wide data bus. This
technique is already well established in the ubiquitous single in line memory modules
(SIMMs) used in personal computer main memories. It is probably possible to use a data
bus sufficiently wide to allow loading of the cache within the time span of a single
correlation.
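The transfer times quoted above can be approximated with a short calculation. The sketch below is a lower bound only: it assumes every bus transfer completes in one 40 ns page mode cycle and ignores the longer access at the start of each page. The function name `min_transfer_time_us` is hypothetical.

```python
def min_transfer_time_us(pixels, bits_per_pixel, bus_width_bits, page_cycle_ns):
    """Lower bound on the DRAM-to-cache transfer time, in microseconds.

    Assumes one bus-wide transfer per page mode cycle, with no extra
    delay at page boundaries.
    """
    transfers = pixels * bits_per_pixel / bus_width_bits
    return transfers * page_cycle_ns / 1000.0

PIXELS = 512 * 512
for width in (16, 32, 64):
    binary = min_transfer_time_us(PIXELS, 1, width, 40)
    hexadecimal = min_transfer_time_us(PIXELS, 4, width, 40)
    print(f"{width:2d} bit bus: binary {binary:7.2f} us, "
          f"hexadecimal {hexadecimal:7.2f} us")
```

For a 16 bit bus this gives about 655 μs and 2621 μs, close to the figures quoted above, and the times halve with each doubling of bus width.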

Although the DRAM main memory system could have a two bank interleaved configuration,
thus eliminating the read latency at the beginning of each page, this is of marginal
benefit  for  typical  chip  page  lengths.  Unless  the  page  length  of  the  DRAM  is  unusually 
small,  or  implementing  memory  interleaving  has  no  complexity  penalties,  it  seems  best 
not  to  incorporate  memory  interleaving. 
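The size of the benefit forgone by omitting interleaving can be estimated as follows. This is a sketch under stated assumptions: the first access of each page is charged the byte mode cycle time, the remaining accesses the page mode cycle time (using the Motorola 1 Mb DRAM figures of 165 ns and 50 ns quoted earlier), and ideal interleaving is taken to hide the page start penalty completely. The function name `page_start_overhead` is hypothetical.

```python
def page_start_overhead(page_bits, bus_width_bits, page_cycle_ns, first_cycle_ns):
    """Fraction of the transfer time lost to the slow first access of each page."""
    accesses_per_page = page_bits // bus_width_bits
    ideal = accesses_per_page * page_cycle_ns
    actual = first_cycle_ns + (accesses_per_page - 1) * page_cycle_ns
    return (actual - ideal) / actual

typical = page_start_overhead(512, 16, 50, 165)     # typical 512 bit page
short_page = page_start_overhead(64, 16, 50, 165)   # unusually short page

print(f"512 bit page: {typical:.1%} of transfer time lost")
print(f" 64 bit page: {short_page:.1%} of transfer time lost")
```

With these numbers the loss is under 7% for a 512 bit page, but exceeds one third for a 64 bit page, which is consistent with interleaving being worthwhile only for unusually short pages.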

9. Page Mode: A fast method of consecutively accessing a whole chip page of memory. The page
address is specified only at the beginning, and is retained for the duration of the operation, instead of
being respecified for each byte, as in the slower byte mode.

Since filter pixels are transferred in an invariant and known sequence, both the DRAM
main memory and SRAM cache can be serial access memories, to reduce complexity and
cost. Performance will not be compromised in any way by the reduced flexibility of the
memory, because serial access always is the fastest mode of memory access, even when the
memory does have random access capabilities. The serial access architecture is essentially
one very long shift register for every parallel output line, and is used in the frame buffers
mentioned in Appendix B.
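The serial access architecture described above can be pictured with a toy software model, in which each parallel output line is backed by its own shift register. The class name `SerialAccessMemory` is hypothetical, and the recirculation of bits (so that the stored filter survives a read) is an assumption of this sketch rather than a feature described in the report.

```python
from collections import deque

class SerialAccessMemory:
    """Toy model: one shift register per parallel output line."""

    def __init__(self, rows):
        # rows[i] holds the bit sequence stored for output line i
        self.lines = [deque(row) for row in rows]

    def clock(self):
        """Shift every register one place and return the emerging word."""
        word = []
        for line in self.lines:
            bit = line.popleft()
            line.append(bit)  # recirculate so the contents are preserved
            word.append(bit)
        return word

mem = SerialAccessMemory([[1, 0, 1], [0, 0, 1]])
words = [mem.clock() for _ in range(3)]
print(words)
```

No address is ever supplied: the only operation is a clock, which is why such a memory can dispense with address decoding circuitry.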

Data  transfer  to  the  SLM  by  the  synchronous  DRAM  design  of  Section  3  occurs  at  very 
fast  bit  rates  for  brief  time  periods  separated  by  extended  periods  of  no  communication.  In 
contrast,  the  present  on-SLM  cache  memory  concept  transfers  the  same  data  at  a  lower  bit 
rate over extended time intervals, separated by inactive intervals shorter than those of the
synchronous DRAM design. Unlike the synchronous DRAM design, the intercomponent
bit  rate  in  the  on-SLM  cache  memory  concept  is  low  enough  to  be  accommodated  by 
conventional  interface,  packaging  and  printed  circuit  board  technologies.  Detailed  design 
of  the  on-SLM  cache  memory  system  is  thus  simplified  in  some  important  respects. 

In  the  case  of  an  exceptionally  responsive  SLM  from  the  perspective  of  pixel  charging 
times,  it  may  be  tempting  to  consider  fabricating  a  bipolar  SRAM  cache.  Certainly, 
BJT  memories,  in  particular  ECL,  have  very  fast  speeds;  but  they  also  have  high  power 
consumption  and  low  device  density.  More  complicated  and  expensive  semiconductor 
processing  is  required  to  fabricate  both  BJTs  (for  the  cache)  and  FETs  (for  the  SLM 
backplane)  on  the  same  silicon  wafer.  However,  combining  BJTs  and  FETs  on  the  same 
chip  is  a  well  established  process,  being  the  foundation  of  biCMOS  technology.  A  64  kb 
ECL  SRAM  with  access  times  of  ~3  ns,  or  a  5  kb  ECL  SRAM  with  access  times  of  <1  ns, 
represent  the  present  capabilities  of  bipolar  semiconductor  memories. 

Extra  semiconductor  processing  steps  may  be  needed  for  integrating  an  SRAM  cache 
onto  the  SLM  backplane  chip.  For  an  nMOS  SRAM,  an  implantation  step  will  have  to 
be introduced to fabricate the depletion mode transistors. For a CMOS SRAM, an
n-well diffusion step has to be introduced into the fabrication process, followed by different
diffusion  steps  for  the  source/drain  regions  of  the  n-channel  and  p-channel  MOSFETs. 
None  of  these  steps  are  needed  for  the  SLM  pixel  array.  However,  the  peripheral  circuitry 
that  would  be  present  regardless  of  any  cache  memory  may  need  these  steps,  so  the  extent 
to  which  inclusion  of  cache  SRAM  circuitry  increases  fabrication  complexity  is  uncertain. 


5  Conclusion 


Figures  quoted  in  Section  3  indicate  that  a  well  designed  present  generation  SLM 
backplane  is  capable  of  handling,  with  reserve,  a  notional  high  performance  synchronous 
DRAM  memory  system.  And  it  is  inevitable  that  SLM  backplanes  with  metal  row  lines, 
and  much  increased  writing  speeds,  eventually  will  become  available,  thus  making  feasible 
SLM  memory  systems  seem  even  more  inadequate.  Such  reasoning  probably  exaggerates 
the  speed  potential  of  the  SLM,  because  it  ignores  the  FLC  switching  time  in  response  to 
a transition of the pixel capacitor voltage. Typical FLC switching times in current SLMs
are about 50 μs (McKnight et al [10]), although FLC materials that allow much faster
switching are already available.

On balance, it seems that a state of the art active backplane FLC SLM will outperform
any evolutionary advance in DRAM memory systems. The favourable aspect of this
circumstance is that any effort expended in increasing memory speed will be rewarded by
an almost proportionate increase in SLM system speed. I have proposed two enhanced
performance SLM memory architectures in this report, and based upon the evidence that I
have presented, I conclude that the performance improvements of these memories translate
into worthwhile improvements in SLM filter update rate. Faster memories certainly are
conceivable, but these are exotic in some respect (architecture, devices, materials,
signalling or packaging), and are probably not serious candidates for SLM memories.
Appendix A

DRAMs  for  digital  television 


Because  of  the  highly  specialised  nature  of  optical  correlators,  they  will  probably  never 
be  a  driving  influence  on  memory  technology.  Instead,  they  will  probably  derive  their 
memory  technology  from  mass  market  applications. 

High definition television (HDTV), once it becomes a commercial reality, is anticipated
to be the largest consumer of DRAM, larger than all types of computing and
digital telecommunications. HDTV will also be a demanding driver of memory technology,
since it will require higher memory throughput than most computers and their
displays. Consider the example of converting the field refresh rate on conventional colour
televisions from 50 Hz to 100 Hz, to reduce screen flicker. The combined input and output
data throughput is 372 Mb/s. Based upon the proposed standards for HDTV, HDTV
will require a combined input and output data throughput of 1950 Mb/s. The amount of
memory required will be over 4 MB.

For elementary processing of HDTV images, memory is only accessed sequentially,
just as in optical correlators. Accordingly, it is reasonable to expect that any specialised
memory  components  developed  for  HDTV  also  will  be  well  suited  for  use  as  the  SLM 
memory  in  optical  correlators.  Salters  [5]  opines  that  the  HDTV  memory  requirement  will 
be  fulfilled  by  using  multiple  memory  chips  in  parallel,  with  a  total  bus  width  of  64  bits. 
Individual  memory  chips  would  need  to  be  accessed  in  page  mode  to  achieve  the  necessary 
data  rate.  The  same  design  principles  would  need  to  be  adopted  for  SLM  memory  in 
optical  correlators,  as  exemplified  by  the  memory  design  in  Section  3. 
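To see why a fast access mode is needed, the throughput figures above can be converted into the cycle time required on each line of a 64 bit bus. The sketch assumes that the throughput is spread evenly over all bus lines and that 1 Mb/s means 10^6 bits per second; both are assumptions of this example, and the function name `per_line_cycle_ns` is hypothetical.

```python
def per_line_cycle_ns(throughput_mbps, bus_width_bits):
    """Cycle time each bus line must sustain, in nanoseconds."""
    per_line_mbps = throughput_mbps / bus_width_bits
    return 1000.0 / per_line_mbps

flicker_cycle = per_line_cycle_ns(372, 64)   # 50 Hz to 100 Hz conversion
hdtv_cycle = per_line_cycle_ns(1950, 64)     # proposed HDTV standards

print(f"100 Hz conversion: {flicker_cycle:.1f} ns per transfer")
print(f"HDTV:              {hdtv_cycle:.1f} ns per transfer")
```

The HDTV case requires a cycle time of roughly 33 ns, well below the 80 ns random access cycle time quoted in Appendix B for standard asynchronous DRAMs.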

In  summary,  HDTV  is  a  prospective  mainstream  application  that  will  direct  the  future 
of  semiconductor  memory  technology,  and  it  is  fortunate  that  the  memory  requirements 
of  HDTV  are  similar  to  those  of  SLMs  in  optical  correlators.  It  should  prove  beneficial  for 
optical  correlator  designers  to  monitor  closely  the  ongoing  development  of  HDTV  memory, 
with  the  intention  of  adapting  it  to  their  particular  application. 
Appendix B
Survey  of  fast  DRAMs 


Many  different  types  of  high  speed  DRAM  are  available,  distinguished  either  by  their 
fabrication  technology,  chip  architecture,  or  functionality  as  system  components.  The 
performance improvement offered by fast DRAMs should be assessed against the best
random access speed available from standard asynchronous DRAMs, namely 40 ns access
time and 80 ns cycle time.

Hitachi  has  achieved  17  ns  access  time  from  a  DRAM  fabricated  using  a  biCMOS 
process;  in  which  the  memory  cell  array  is  fabricated  in  CMOS,  but  the  peripheral  circuitry 
on  the  chip  is  fabricated  in  ECL.  biCMOS  is  more  complex  and  expensive  than  standard 
CMOS.  This  circuit,  which  has  a  conventional  DRAM  architecture,  is  representative  of  the 
fabrication  technology  approach  to  performance  improvement;  an  approach  more  common 
in  SRAMs  than  DRAMs.  Most  schemes  of  DRAM  performance  improvement  concentrate 
on  circuit  design  innovation. 

One  established  high  speed  DRAM  architecture  is  the  video  DRAM,  which  is  used 
for  controlling  computer  displays.  Video  DRAMs  write  the  contents  of  a  complete  row 
of  DRAM  cells  in  parallel  into  the  cells  of  a  shift  register.  A  serial  data  stream  is  then 
clocked  out  of  the  shift  register.  This  operation  is  well  suited  to  an  SLM  memory.  A 
random  access  write  can  be  simultaneously  made  to  the  memory  array.  This  latter  facility 
is  of  no  benefit  to  an  optical  correlator,  in  which  the  filters  are  invariant.  Video  DRAM 
circuitry  occupies  up  to  50%  more  area  on  the  silicon  die  than  conventional  DRAMs  of 
the  same  bit  capacity.  The  extra  functionality  and  performance  of  video  DRAM  comes  at 
the  expense  of  increased  package  size  and  pin  count,  higher  power  dissipation,  and  greater 
cost. 

Video  DRAMs  are  faster  than  conventional  DRAMs,  but  not  as  fast  as  emerging 
generation  high  performance  DRAMs.  The  present  frontier  of  video  DRAM  capability 
is  exemplified  by  a  Texas  Instruments  video  DRAM,  that  has  a  capacity  of  1  Mb,  and 
outputs  eight  serial  bit  streams,  each  uninterrupted  at  70  Mb/s.  Uninterrupted  operation 
at  this  rate  is  achieved  by  pipelining  operations,  and  on-chip  memory  bank  interleaving. 

Frame  buffer  DRAMs  are  specialty  memories  developed  for  video  display  applications, 
in  particular  digital  television.  They  have  simplified  chip  interfaces  that  only  allow  serial 
input and output. Frame buffers have serial output performance similar to that of video DRAMs,
but  at  a  lower  cost,  because  of  the  absence  of  random  access  capability.  The  filter  access 
requirements  of  an  optical  correlator  are  only  serial,  so  that  frame  buffers  would  be  just 
as  suitable  for  the  SLM  memory  as  video  or  conventional  DRAMs.  A  particularly  capable 
frame  buffer  is  a  16  Mb  component  from  Toshiba,  that  simultaneously  outputs  four  serial 
bit  streams  of  length  2048  bits  each,  at  a  rate  of  100  Mb/s.  This  type  of  component  merits 
serious  consideration  for  use  in  the  SLM  memory  of  an  optical  correlator. 
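For comparison with the other memory options, the time that a single chip of each type would take to output one 512x512 binary filter can be estimated from the stream counts and rates quoted above. The sketch assumes that all streams run continuously with no gaps between bursts and that 1 Mb/s means 10^6 bits per second; the function name `filter_readout_us` is hypothetical.

```python
def filter_readout_us(filter_bits, streams, rate_mbps):
    """Time to clock one filter out of a serial output memory, in microseconds."""
    return filter_bits / (streams * rate_mbps)

FILTER_BITS = 512 * 512   # one binary 512x512 filter

video_dram = filter_readout_us(FILTER_BITS, 8, 70)     # Texas Instruments part
frame_buffer = filter_readout_us(FILTER_BITS, 4, 100)  # Toshiba part

print(f"Video DRAM:   {video_dram:.0f} us per filter")
print(f"Frame buffer: {frame_buffer:.0f} us per filter")
```

Several chips operated in parallel would reduce these times in proportion to the number of chips, under the same assumptions.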

Emerging  high  performance  DRAM  architectures  include  the  Enhanced  DRAM  from 
Ramtron,  and  Cache  DRAM  from  Mitsubishi  Electric.  Both  of  these  architectures  are 
based around the presence of a fast SRAM cache on the DRAM main memory chip,
with  a  very  wide  bus  connecting  main  and  cache  memories.  Since  the  bus  is  entirely  on 
the  integrated  circuit,  it  allows  very  high  speed  signalling,  due  to  its  very  low  RC  time 
constant. 

The two memories are quite distinct in their performance and operational characteristics.
Present Enhanced DRAMs have burst mode cycle times of 15 ns on a sequence of
2048  bits,  and  interface  signals  are  CMOS/TTL  compatible.  They  operate  asynchronously, 
just  like  conventional  DRAMs,  and  are  entirely  suitable  for  direct  substitution  in  memory 
systems  designed  for  conventional  DRAM.  Present  Cache  DRAMs  have  10  ns  cycle  times 
on  sequences  of  64  bits  extracted  from  cache  storage,  and  have  CMOS/TTL  compatible 
interfaces.  They  operate  synchronously,  unlike  conventional  DRAMs,  and  they  require 
memory  system  designs  that  are  very  different  from  those  for  conventional  DRAMs.  The 
synchronisation  protocol  of  the  proprietary  Cache  DRAM  is  not  compatible  with  that  for 
the industry standard synchronous DRAM discussed in Section 2.

The  sophistication  of  the  caches  of  the  Enhanced  DRAM  and  Cache  DRAM  is  beneficial 
for  fast  random  access  of  the  memory,  but  superfluous  for  the  sequential  access  undertaken 
by SLM memory. Neither of these emergent DRAM technologies seems particularly suitable
for use in SLM memories.

The  Rambus  DRAM  is  a  system  wide  architecture  developed  and  patented  by  Rambus. 
As  well  as  specifying  operational  characteristics  of  the  DRAM  circuitry,  it  also  defines 
the  interface  electrical  characteristics,  bus  structure,  communication  protocol  and  control 
hierarchy.  The  technology  has  been  licensed  by  Toshiba,  Fujitsu  and  NEC.  Rambus 
DRAM  operates  synchronously,  but  not  according  to  the  protocol  of  the  industry  standard 
synchronous  DRAMs  considered  in  Section  2. 

Rambus  DRAM  sustains  an  impressive  cycle  time  of  2  ns  on  sequences  of  256  bits. 
However,  the  benefits  of  the  rapid  serial  data  rate  of  Rambus  are  somewhat  diminished 
by  the  fact  that  the  Rambus  bus  is  specified  to  be  9  bits  wide,  and  there  is  no  scope  for 
increasing  data  throughput  by  using  multiple  buses.  Further  limiting  the  effective  data 
throughput,  is  the  fact  that  each  transfer  of  a  block  of  256  bytes  is  preceded  by  the  transfer 
of  header  bytes.  Furthermore,  the  bus  is  shared  by  data,  addresses  and  control  signals, 
causing  a  ‘staccato’  data  flow.  Consequently,  the  potential  data  throughput  of  Rambus 
DRAM  systems  is  not  as  impressive  as  the  serial  data  rate  would  superficially  suggest  is 
possible. 

Interface levels in Rambus memory systems are low voltage swing (0.6 V) signals on a
terminated  bus,  to  accommodate  the  high  speed  signalling.  Present  Rambus  DRAMs  have 
4.5  Mb  storage  capacities.  The  Rambus  architecture  has  provision  for  several  blocks  of 
DRAM,  but  just  one  master  may  fit  on  the  bus,  and  this  master  must  control  the  Rambus 
system  and  act  as  the  gateway  to  the  remainder  of  the  system.  In  the  context  of  SLM 
memory,  the  SLM  would  be  external  to  the  Rambus  system,  so  it  will  receive  its  data 
via  a  circuitous  route,  with  the  potential  for  a  communication  bottleneck  as  the  Rambus 
master  distributes  its  data  to  the  SLM.  The  Rambus  architecture  is  not  well  suited  to  the 
role  of  SLM  memory,  primarily  because  of  the  inflexibility  in  bus  width. 

RamLink hardware consists of DRAMs with special interfaces connected in a ring
topology using LVDS links. There is an expectation among some experts that rings will
eventually displace buses as the communication methodology in future computers. RamLink
uses a packet based protocol in distributing data around the ring. Although RamLink
components are not yet commercially available, the RamLink concept seems to have
credibility, because of the support of the IEEE Computer Society for its accession to the
status of universal standard.

The  initial  version  of  RamLink  is  expected  to  have  a  2  ns  cycle  time  for  sequences 
(packets)  of  64  8-bit  bytes,  with  the  packet  delimited  by  a  6  byte  header  and  2  byte 
tail.  DRAM  rings  are  connected  to  a  controller,  which  also  acts  as  the  gateway  to  the 
remainder  of  the  system,  which  in  the  present  context  includes  the  SLM.  Each  DRAM 
in the ring introduces a delay in the data flow of at least one cycle time. Apart from the
replacement of a bus topology by a ring topology, RamLink is similar to Rambus from a
system viewpoint, and neither seems to be a propitious solution for SLM memory.
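The cost of the RamLink packet framing can be estimated from the packet format given above. The sketch assumes that one byte is transferred per 2 ns cycle and that header and tail bytes occupy the link at the same rate as data; these are assumptions of this example, not details taken from the RamLink specification. The function name `packet_efficiency` is hypothetical.

```python
def packet_efficiency(payload_bytes, header_bytes, tail_bytes):
    """Fraction of link time that carries payload data."""
    total = payload_bytes + header_bytes + tail_bytes
    return payload_bytes / total

CYCLE_TIME_NS = 2.0                    # one byte per cycle (assumed)
raw_rate_mbytes = 1000.0 / CYCLE_TIME_NS   # MB/s, taking 1 MB = 10^6 bytes

efficiency = packet_efficiency(64, 6, 2)
effective_rate_mbytes = efficiency * raw_rate_mbytes

print(f"Payload efficiency: {efficiency:.1%}")
print(f"Effective data rate: about {effective_rate_mbytes:.0f} MB/s")
```

Under these assumptions, about one ninth of the link capacity is consumed by packet headers and tails, before any ring delays are considered.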



DSTO-RR-0094 


References 

1. B. Prince, Semiconductor Memories: A Handbook of Design, Manufacture and
Application, 2nd ed., Wiley, Chichester (1991).

2.  R.  Comerford  and  G.  F.  Watson,  “Memory  catches  up,”  IEEE  Spectrum,  October 
1992,  34-57. 

3.  B.  Prince,  “Memory  in  the  fast  lane,”  IEEE  Spectrum,  February  1994,  38-41. 

4. U. Efron (ed.), Spatial Light Modulator Technology: Materials, Devices, and
Applications, Marcel Dekker, New York (1995).

5.  R.  Salters,  “Fast  DRAMs  for  sharper  TV,”  IEEE  Spectrum,  October  1992,  40-2. 

6. K. Numata, Y. Oowaki, Y. Itoh, T. Hara, K. Tsuchida, M. Ohta, S. Watanabe and
K. Ohuchi, “New nibbled-page architecture for high-density DRAMs,” IEEE J.
Solid-State Circuits 24, 900-4 (1989).

7. S. Serati, Boulder Nonlinear Systems, private communication (1996).

8. M. A. Handschy, T. J. Drabik, L. K. Cotter and S. D. Gaalema, “Fast
ferroelectric-liquid-crystal spatial light modulator with silicon-integrated-circuit active
backplane,” SPIE Proc. 1291, Optical and Digital GaAs Technologies for Signal-Processing
Applications, 158-64 (1990).

9.  L.  Geppert,  “Technology  1996:  Solid  State,”  IEEE  Spectrum,  January  1996,  51-5. 

10. D. J. McKnight, K. M. Johnson and R. A. Serati, “256x256 liquid-crystal-on-silicon
spatial light modulator,” Appl. Opt. 33, 2775-84 (1994).



Glossary 


BJT  Bipolar  Junction  Transistor 

The  ‘original’  transistor  structure,  consisting  of  a  layer  of  n-type  semiconductor 
(the  emitter),  forming  a  junction  with  a  thin  lightly  p-doped  layer  (the  base),  the 
other side of which forms a junction with another n-type layer (the collector).
This  describes  an  npn  transistor;  pnp  transistors  are  also  possible.  BJTs  have  three 
distinct  modes  of  operation:  cut-off  mode,  in  which  negligible  current  flows  through 
all  three  electrodes;  active  mode,  in  which  the  large  charge  flow  from  the  emitter 
to  the  collector  is  controlled  by  the  emitter-base  voltage,  but  in  which  the  base 
electrode  current  is  relatively  insignificant  (BJTs  in  linear  circuits  operate  in  active 
mode);  and  saturation  mode,  in  which  significant  current  flows  through  all  three 
electrodes,  and  the  base  region  becomes  saturated  with  minority  carriers  (electrons 
in  npn  transistors). 

CCD  Charge  Coupled  Device 

A  one  dimensional  array  of  devices  formed  by  appropriate  patterning  of  surface 
and  buried  electrodes  on  a  semiconductor  die.  By  applying  certain  voltages  to  the 
electrodes,  an  array  of  potential  wells  is  formed,  in  which  charge  from  photovoltaic 
detectors  accumulates.  On  correct  cycling  of  the  electrode  potentials,  charge  is 
transferred  between  adjacent  wells.  The  analogue  charge  ‘spilling’  out  of  the  end  of 
the  CCD  line  array  may  be  sensed  and  converted  into  a  digital  representation  of  the 
photodetector  signal.  CCDs  are  loaded  with  data  in  parallel  from  the  photodetectors, 
but  the  data  only  can  be  read  out  serially.  Planar  CCD  arrays  usually  are  fabricated 
as  several  line  arrays  stacked  side  by  side. 

CMOS  Complementary  Metal  Oxide  Semiconductor 

An integrated circuit technology in which the only transistors are n-channel and
p-channel enhancement MOSFETs, usually appearing in pairs. CMOS logic gates only
consume  power  when  switching  state,  have  wider  noise  margins  than  nMOS,  and  a 
better  tolerance  to  voltage  and  temperature  variation  than  nMOS.  These  properties 
usually  render  CMOS  the  favoured  integrated  circuit  technology. 

CTT  Centre-tap  terminated 

An  adaptable,  fast,  low  voltage  swing  interface  standard.  For  unterminated  lines, 
CTT  has  the  same  characteristics  as  standard  CMOS  interfaces,  so  it  is  downward 
compatible  with  low  voltage  CMOS  and  TTL  levels.  For  terminated  lines,  the  CTT 
interface automatically adjusts to 0.8 V low voltage swing operation.

DRAM  Dynamic  Random  Access  Memory 

Semiconductor  memory  with  an  extremely  small  cell  size.  Usually,  a  bit  storage 
cell  is  a  capacitor  that  stores  charge  in  one  logic  state,  and  a  transistor  switch  that 
connects  the  cell  to  the  periphery  of  the  memory  array.  The  charge  on  the  capacitor 
leaks  away  over  time,  hence  it  must  be  periodically  refreshed  by  specialised  circuitry. 
Accessing  DRAM  is  inherently  slower  than  accessing  SRAM.  In  the  case  of  writing, 
the  DRAM  cell  loads  the  line  driver  with  its  large  cell  capacitor,  while  the  equivalent 
load  in  an  SRAM  cell  is  the  small  FET  gate  capacitance.  In  the  case  of  reading, 
the DRAM cell capacitor must partially discharge with a long RC time constant,
while  the  corresponding  time  constant  for  an  SRAM  cell  is  determined  by  the  small 
FET  capacitance  components.  Additionally,  DRAMs  must  be  periodically  refreshed, 
and  the  cell  capacitor  charge  restored  after  reading,  neither  of  which  are  needed  by 
SRAMs.  These  factors  combine  to  give  SRAM  an  intrinsic  speed  advantage  over 
DRAM. 

ECL  Emitter  Coupled  Logic 

Digital  integrated  circuit  technology  using  BJTs  connected  as  differential  pairs  to 
form logic gates. Transistors switch between cut-off and active modes in logic
transitions. By avoiding saturation, ECL achieves the ultimate switching speed possible
from its devices. ECL has niche applications in the highest speed, low density
integrated circuit domain. ECL has lower density and higher power dissipation than
TTL  and  especially  MOS  technologies.  The  very  low  output  impedance  of  ECL 
gates  gives  them  particularly  good  output  drive. 

FLC  Ferroelectric  Liquid  Crystal 

FLCs are formed when chiral liquid crystal molecules aggregate in the smectic C*
phase.  These  materials  are  ferroelectric,  that  is,  they  have  a  finite  spontaneous 
polarisation.  They  are  also  birefringent  because  of  the  long  range  orientational 
order  of  the  long  molecules.  In  equilibrium  the  polarisation  vector  tends  to  align 
along  the  external  electric  field  direction  (minimum  potential  energy  state),  and  this 
determines  the  orientation  of  the  molecules,  which  in  turn  determines  the  orientation 
of  the  optical  axes  of  the  material  anisotropy,  which  in  turn  determines  the  optical 
path  length  for  light  propagating  through  the  material,  which  determines  the  phase 
of  the  light  ray  at  the  output  surface  of  the  FLC  layer.  By  this  mechanism  FLCs 
fulfil the role of the light modulating layer of SLMs. In practical SLMs, the only
possible  electric  field  direction  changes  are  reversals,  so  there  are  usually  only  two 
accessible  FLC  states  (hysteresis  in  state  transitions  prevents  the  FLC  from  entering 
other  states  without  deliberate  forcing),  and  FLC  SLMs  are  inherently  binary  light 
modulators. 

Gb  gigabit 

1 gigabit = 2^30 bits = 1073741824 bits.

GTL  Gunning  Transceiver  Logic 

A fast, low power, low voltage swing interface standard with a 0.8 V voltage swing.
GTL  drivers  typically  dissipate  about  10  mW,  which  compares  favourably  with  fast 
ECL  drivers  that  consume  about  125  mW.  The  exceptionally  low  power  consumption 
of  GTL  makes  it  feasible  to  incorporate  hundreds  of  GTL  drivers  and  receivers  on 
the  one  integrated  circuit. 

LVDS  Low  Voltage  Differential  Signalling 

A fast, low power, low noise, low voltage swing interface standard, with a 0.25 V
voltage  swing.  Signals  propagate  along  a  pair  of  closely  coupled  conductors,  as  in 
conventional  transmission  lines,  and  this  provides  enhanced  noise  immunity.  The 
differential  signal  is  defined  by  the  potential  difference  between  the  two  conductors, 
not a single conductor and a space and time variant ‘ground’ potential. Up to 1 V
of  common  mode  noise  on  the  two  conductors  can  be  tolerated  without  degradation 
of  the  signal.  LVDS  requires  a  pair  of  pins  for  each  interface  port. 

Mb  megabit 

1 megabit = 2^20 bits = 1048576 bits.

MB  megabyte 

1 megabyte = 2^20 bytes = 1048576 bytes.

MOSFET  Metal  Oxide  Semiconductor  Field  Effect  Transistor 

A  transistor  in  which  the  current  flow  between  the  source  and  the  drain  is  influenced 
(i.e.  switched)  by  the  voltage  applied  to  the  interposed  gate  electrode.  Varieties  are 
n-channel, in which the current is composed of only electrons; and p-channel, in
which the current is composed of only holes. MOSFETs may also be classified as
enhancement type, where a gate voltage of opposite polarity to the current carriers
must be applied to make the FET conductive; and depletion type, which are
conductive  until  a  gate  voltage  of  the  same  polarity  as  the  current  carriers  is  applied. 

nMOS  n-channel  Metal  Oxide  Semiconductor 

An integrated circuit technology in which the only transistors are n-channel
enhancement and depletion MOSFETs. nMOS logic gates persistently consume power
when  the  output  state  is  low.  nMOS  technology  attained  maturity  before  CMOS 
technology,  but  now  it  seems  that  nMOS  is  inexorably  being  replaced  by  CMOS. 

SLM  Spatial  Light  Modulator 

A  planar  device  whose  output  is  a  spatially  coherent  light  wavefront.  A  specific 
spatial pattern of amplitude or phase variation is imposed on the extended wavefront
by the SLM. In an optical correlator this pattern represents either the image
(1st  SLM)  or  the  filter  (2nd  SLM).  Optically  addressed  SLMs  have  a  photosensitive 
layer  superimposed  on  the  light  modulating  layer,  and  the  photosensitive  layer  is 
illuminated  by  an  image,  which  is  coupled  through  to  the  light  modulating  layer  as 
the  required  pattern.  They  usually  have  continuous  surface  properties.  In  principle 
the degree of modulation is continuously variable, but in practice the modulating
mechanism may be quasi-bistable, yielding effectively binary modulation. Addressing
is  parallel,  since  the  pattern  emerges  at  all  locations  in  unison.  Electrically  addressed 
SLMs  have  electronic  circuitry  in  intimate  contact  with  the  light  modulating  layer. 
This  structure  is  achieved  by  either  depositing  the  modulating  layer  on  a  monolithic 
integrated  circuit,  or  depositing  thin  film  devices  and  circuitry  on  the  modulating 
layer.  The  electronic  circuit  essentially  is  an  array  of  memory  cells  that  each  impose 
a  localised  voltage  on  the  light  modulating  layer.  Spatial  variation  of  the  applied 
voltages  defines  the  pattern.  The  surface  state  is  necessarily  pixelated,  according  to 
the  size  and  separation  of  the  memory  cells.  Addressing  is  partially  serial,  as  not  all 
memory  cells  can  be  accessed  at  the  same  time.  Usually  each  memory  cell  has  a  one 
bit  capacity,  giving  intrinsically  binary  modulation  properties. 
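
The partially serial addressing of an electrically addressed SLM can be illustrated with a minimal sketch. It assumes, purely for illustration, that the memory array is written one complete row per write cycle; the entry above does not specify the access granularity, and the names load_frame and pattern are hypothetical.

```python
def load_frame(frame):
    """Write a binary frame into a simulated SLM memory array, one row per cycle.

    Assumption: a whole row of one-bit cells is written in each cycle.
    Returns the resulting cell array and the number of write cycles used.
    """
    rows = len(frame)
    cols = len(frame[0])
    cells = [[0] * cols for _ in range(rows)]  # every pixel starts in state 0
    write_cycles = 0
    for r, row in enumerate(frame):
        if len(row) != cols:
            raise ValueError("all rows must have the same number of pixels")
        # Each cell holds a single bit, so modulation is intrinsically binary.
        cells[r] = [1 if value else 0 for value in row]
        write_cycles += 1
    return cells, write_cycles


pattern = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 1, 0, 0],
]
cells, cycles = load_frame(pattern)
```

Under this assumption the number of write cycles equals the number of rows, whereas an optically addressed SLM receives the whole pattern in unison.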

SRAM  Static  Random  Access  Memory 

Semiconductor  memory  in  which  each  cell  is  a  bistable  circuit,  such  as  two  inverters
connected  in  a  ring  (known  as  a  flipflop).  In  nMOS  SRAMs,  bit  storage  cells  have
device  counts  of  at  least  six;  four  transistors  for  the  flipflop,  and  two  transistors  for 
the  access  switches  to  the  complementary  outputs  of  the  flipflop.  SRAM  cells  are 
typically  four  times  larger  than  DRAM  cells,  which  accounts  for  the  fourfold  density
advantage  that  DRAM  has  over  SRAM.  The  logic  state  in  the  SRAM  cell  is  retained
for  as  long  as  the  power  is  on;  there  is  no  need  for  refreshing.  Thus,  SRAMs
are  always  available  for  accessing,  unlike  DRAMs,  which  cannot  be  accessed  during
the  periodic  refresh  cycles.  In  nMOS  SRAMs,  there  is  always  one  inverter  in  the 
flipflop  that  is  conducting  (the  one  with  high  input  and  low  output).  Accordingly, 
all  nMOS  SRAM  cells  consume  significant  power  at  all  times,  unlike  DRAM  cells,  which
only  dissipate  power  as  a  result  of  the  slow  leakage  of  charge  from  their  capacitor. 
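
The bistable behaviour of a ring of two inverters can be sketched as a simple logic-level model. This is an illustrative abstraction only: it ignores transistor-level behaviour, timing and power, and the names inverter and SramCell are hypothetical.

```python
def inverter(level):
    """Ideal logic inverter: returns 1 for input 0 and 0 for input 1."""
    return 0 if level else 1


class SramCell:
    """Logic-level model of two inverters connected in a ring (a flipflop)."""

    def __init__(self, bit=0):
        self.q = 1 if bit else 0
        self.q_bar = inverter(self.q)

    def settle(self):
        # Each inverter drives the input of the other, so a consistent
        # complementary state reproduces itself; no refresh is required.
        self.q, self.q_bar = inverter(self.q_bar), inverter(self.q)

    def write(self, bit):
        # The two access switches force complementary levels on both outputs.
        self.q = 1 if bit else 0
        self.q_bar = inverter(self.q)

    def read(self):
        return self.q


cell = SramCell()
cell.write(1)
for _ in range(10):
    cell.settle()
stored_bit = cell.read()
```

After any number of settle steps the stored bit is unchanged, which is the sense in which the cell retains its state while power is applied.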

TSOP  Thin  Small  Outline  Package 

A  type  of  package  for  integrated  circuits,  which  is  conducive  to  high  speed
operation.  The  small  size  of  TSOPs  allows  short  connecting  leads  between  package  pins
and  circuit  pads,  which  results  in  small  pin  inductances,  which  in  turn  reduces  the 
propensity  for  voltage  spikes  and  ringing  on  fast  signal  transitions. 
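
The link between pin inductance and voltage spikes follows from the standard inductor relation V = L di/dt. The sketch below approximates di/dt as a linear current step over the transition time. The inductance, current and transition values are hypothetical, chosen only to show the scaling, and are not taken from this report or from any data sheet.

```python
def inductive_spike_volts(inductance_henries, current_step_amps, transition_seconds):
    """Estimate the voltage induced across a pin inductance, V = L * di/dt,
    treating di/dt as a linear current step over the transition time."""
    return inductance_henries * current_step_amps / transition_seconds


# Hypothetical comparison: a long-leaded package pin versus a short-leaded one,
# both switching 20 mA in 1 ns.
long_lead_spike = inductive_spike_volts(5e-9, 20e-3, 1e-9)   # assumed 5 nH pin
short_lead_spike = inductive_spike_volts(1e-9, 20e-3, 1e-9)  # assumed 1 nH pin
```

With these assumed values the estimates are about 0.1 V and 0.02 V respectively; the spike scales in direct proportion to the pin inductance, which is why shorter leads help at fast edge rates.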

TTL  Transistor-Transistor  Logic 

Digital  integrated  circuit  technology  using  BJTs  as  active  devices.  Transistors  switch 
between  cut-off  and  saturation  modes  in  logic  transitions.  The  considerable  base 
charging  time  in  the  transition  to  saturation  mode  is  the  limiting  factor  in  switching 
speed.  TTL  voltage  levels  for  logic  states  have  become  a  common  interface  voltage 
standard  for  digital  systems,  even  when  the  component  integrated  circuits  use  a 
technology  other  than  TTL.